 Freising



PROSPECT: Labeled Tandem Mass Spectrometry Dataset for Machine Learning in Proteomics

Neural Information Processing Systems

PROSPECT provides value to proteomics and machine learning researchers by including several high-quality annotations and by being accessible in terms of format and structure for applying machine learning.



Conversational no-code and multi-agentic disease module identification and drug repurposing prediction with ChatDRex

Süwer, Simon, Bagemihl, Kester, Baier, Sylvie, Dicunta, Lucia, List, Markus, Baumbach, Jan, Maier, Andreas, Delgado-Chaves, Fernando M.

arXiv.org Artificial Intelligence

Repurposing approved drugs offers a time-efficient and cost-effective alternative to traditional drug development. However, in silico prediction of repurposing candidates is challenging and requires the effective collaboration of specialists in various fields, including pharmacology, medicine, biology, and bioinformatics. Fragmented, specialized algorithms and tools often address only narrow aspects of the overall problem, and heterogeneous, unstructured data landscapes require specialized users to be involved. Hence, these data services do not integrate smoothly across workflows. With ChatDRex, we present a conversation-based, multi-agent system that facilitates the execution of complex bioinformatic analyses aiming for network-based drug repurposing prediction. It builds on the integrated systems medicine knowledge graph NeDRex. ChatDRex provides natural language access to its extensive biomedical KG and integrates bioinformatics agents for network analysis and drug repurposing, complemented by agents for functional coherence evaluation for in silico validation, as well as agents for literature mining and for discussing the obtained results in a scientific context. Its flexible multi-agent design assigns specific tasks to specialized agents, including query routing, data retrieval, algorithm execution, and result visualization. A dedicated reasoning module keeps the user in the loop and allows for hallucination detection. By enabling physicians and researchers without computer science expertise to control complex analyses in natural language, ChatDRex democratizes access to bioinformatics as an important resource for drug repurposing. It enables clinical experts to generate hypotheses and explore drug repurposing opportunities, ultimately accelerating the discovery of novel therapies and advancing personalized medicine and translational research.


FUSE: Fast Semi-Supervised Node Embedding Learning via Structural and Label-Aware Optimization

Chakraborty, Sujan, Bordoloi, Rahul, Sengupta, Anindya, Wolkenhauer, Olaf, Bej, Saptarshi

arXiv.org Artificial Intelligence

Graph-based learning is a cornerstone for analyzing structured data, with node classification as a central task. However, in many real-world graphs, nodes lack informative feature vectors, leaving only neighborhood connectivity and class labels as available signals. In such cases, effective classification hinges on learning node embeddings that capture structural roles and topological context. We introduce a fast semi-supervised embedding framework that jointly optimizes three complementary objectives: (i) unsupervised structure preservation via scalable modularity approximation, (ii) supervised regularization to minimize intra-class variance among labeled nodes, and (iii) semi-supervised propagation that refines unlabeled nodes through random-walk-based label spreading with attention-weighted similarity. These components are unified into a single iterative optimization scheme, yielding high-quality node embeddings. On standard benchmarks, our method consistently achieves classification accuracy at par with or superior to state-of-the-art approaches, while requiring significantly less computational cost.
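The first objective in the abstract above refers to modularity, the standard measure of how well a partition of nodes matches a graph's edge structure. The snippet below is a minimal sketch of the exact Newman modularity score for an undirected, unweighted graph. It is not the authors' scalable approximation, and the function name `modularity` and its edge-list input format are assumptions chosen for illustration.

```python
def modularity(edges, community):
    """Newman modularity Q of a node partition (illustrative sketch only).

    edges: list of (u, v) pairs of an undirected, unweighted graph.
    community: dict mapping each node to a community label.
    Uses Q = sum over communities c of  L_c / m - (d_c / (2 m))^2,
    where m is the number of edges, L_c the number of edges inside c,
    and d_c the total degree of the nodes in c.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a graph without edges")

    degree = {}
    intra_edges = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
        if community[u] == community[v]:
            c = community[u]
            intra_edges[c] = intra_edges.get(c, 0) + 1

    degree_sum = {}
    for node, k in degree.items():
        c = community[node]
        degree_sum[c] = degree_sum.get(c, 0) + k

    return sum(
        intra_edges.get(c, 0) / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Two disjoint triangles: splitting them into two communities scores
# higher than lumping all six nodes together.
triangles = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
merged = {node: "a" for node in range(6)}
q_split = modularity(triangles, split)
q_merged = modularity(triangles, merged)
```

An embedding method that preserves structure would be expected to place nodes from a high-modularity community close together; how FUSE approximates this at scale is described in the paper itself.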



9 Appendix: Supplementary material for the paper "Causal analysis of Covid-19 spread in Germany"

Neural Information Processing Systems

For every W in V, W is independent of V \ (Descendants(W) ∪ Parents(W)) given Parents(W). As expected, the number of causes detected by Granger is several times larger than the number detected by SyPI; in most cases Granger detects all the candidate states as causes. On the other hand, SyPI does not suffer from such problems even when there are latent confounders. Finally, in the third column, we report the detected distant causes. Strict thresholds (the default of the SyPI method) are used for the analysis.
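The excerpt above opens with the causal Markov condition: each variable W is independent of all variables that are neither its descendants nor its parents, given its parents. As a minimal sketch, the snippet below computes that set for a node in a small DAG. The graph, the function names, and the choice to exclude W itself from the set are assumptions made for illustration; none of this is taken from the SyPI implementation.

```python
def descendants(parents, node):
    """Return all nodes reachable from `node` along directed edges."""
    # Invert the parent lists into child lists once.
    children = {v: [] for v in parents}
    for child, pars in parents.items():
        for p in pars:
            children[p].append(child)

    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        for child in children[current]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def markov_independent_set(parents, node):
    """Variables that `node` is independent of, given its parents,
    under the causal Markov condition: V minus (Descendants, Parents, node)."""
    excluded = descendants(parents, node) | set(parents[node]) | {node}
    return set(parents) - excluded


# Hypothetical DAG: A -> B -> C and A -> D, written as child -> parent list.
dag = {"A": [], "B": ["A"], "C": ["B"], "D": ["A"]}
independent_of_c = markov_independent_set(dag, "C")
independent_of_b = markov_independent_set(dag, "B")
independent_of_a = markov_independent_set(dag, "A")
```

For this graph the condition says that C is independent of A and D given B, and that B is independent of D given A. For the root A, every other node is a descendant, so the condition imposes no constraint.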



Scaling behavior of large language models in emotional safety classification across sizes and tasks

Pinzuti, Edoardo, Tüscher, Oliver, Castro, André Ferreira

arXiv.org Artificial Intelligence

Understanding how large language models (LLMs) process emotionally sensitive content is critical for building safe and reliable systems, particularly in mental health contexts. We investigate the scaling behavior of LLMs on two key tasks: trinary classification of emotional safety (safe vs. unsafe vs. borderline) and multi-label classification using a six-category safety risk taxonomy. To support this, we construct a novel dataset by merging several human-authored mental health datasets (> 15K samples) and augmenting them with emotion re-interpretation prompts generated via ChatGPT. We evaluate four LLaMA models (1B, 3B, 8B, 70B) across zero-shot, few-shot, and fine-tuning settings. Our results show that larger LLMs achieve stronger average performance, particularly in nuanced multi-label classification and in zero-shot settings. However, lightweight fine-tuning allowed the 1B model to achieve performance comparable to larger models and BERT in several high-data categories, while requiring <2GB VRAM at inference. These findings suggest that smaller, on-device models can serve as viable, privacy-preserving alternatives for sensitive applications, offering the ability to interpret emotional context and maintain safe conversational boundaries. This work highlights key implications for therapeutic LLM applications and the scalable alignment of safety-critical systems.